fix(flow): apply deferred narrows after antecedent resolution #989
lewis6991 wants to merge 4 commits into EmmyLuaLs:main from
Conversation
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request resolves a critical recursion bug within the type inference system, specifically affecting correlated @return_overload narrowing. By introducing a guard mechanism, it prevents infinite loops and stack overflows during complex type analysis, significantly enhancing the stability and reliability of semantic model generation without altering the core inference logic for valid cases.
Code Review
This pull request correctly identifies and fixes a potential stack overflow issue caused by recursive re-entry in correlated @return_overload narrowing. The approach of adding a guard to LuaInferCache is sound. The added regression test is also a great way to ensure this issue does not reappear. I have one suggestion to improve the robustness of the re-entrancy guard implementation.
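The guard pattern suggested here can be sketched roughly as below. All names are hypothetical stand-ins for illustration, not the actual LuaInferCache API: the idea is only that in-flight query keys are tracked so a recursive re-entry bails out instead of overflowing the stack.

```rust
use std::collections::HashSet;

// Hypothetical stand-in for a guarded infer cache (names invented): track
// in-flight keys so a recursive re-entry on the same query bails out
// instead of recursing forever.
struct ReentrancyGuardCache {
    in_progress: HashSet<u32>,
}

impl ReentrancyGuardCache {
    fn new() -> Self {
        Self { in_progress: HashSet::new() }
    }

    /// Returns None when `key` is already being resolved further up the stack.
    fn infer_guarded(&mut self, key: u32) -> Option<&'static str> {
        if !self.in_progress.insert(key) {
            return None; // re-entered the same query: break the recursion
        }
        // ... real inference would run here ...
        let result = "narrowed";
        self.in_progress.remove(&key);
        Some(result)
    }
}
```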
crates/emmylua_code_analysis/src/semantic/infer/narrow/condition_flow/correlated_flow.rs
/gemini review
Code Review
This pull request significantly refactors the control flow analysis to defer type narrowing. By collecting narrowing actions and applying them after resolving the antecedent type, you've addressed potential deep recursion issues and improved the correctness of type inference, especially for complex scenarios involving stacked guards and reassignments. The changes are substantial and well-supported by an impressive suite of new regression tests, which gives great confidence in the robustness of this new approach. The implementation of pending narrows and the updated caching strategy are well-executed. Overall, this is a high-quality improvement to the type system's core logic.
     right_expr: LuaExpr,
     condition_flow: InferConditionFlow,
-) -> Result<ResultTypeOrContinue, InferFailReason> {
+) -> Result<Option<ConditionFlowAction>, InferFailReason> {
Using `Option` here instead of `Continue` somewhat reduces readability.
We use `Option` here because try_get_at_eq_or_neq_expr has:

    if let Some(action) = maybe_type_guard_binary_action(
        db,
        tree,
        cache,
        root,
        var_ref_id,
        flow_node,
        left_expr.clone(),
        right_expr.clone(),
        condition_flow,
    )? {
        return Ok(action);
    }

Using `Continue` would make this a little worse, I think.
crates/emmylua_code_analysis/src/semantic/infer/narrow/condition_flow/index_flow.rs
-fn get_single_antecedent(tree: &FlowTree, flow: &FlowNode) -> Result<FlowId, InferFailReason> {
+fn get_single_antecedent(_tree: &FlowTree, flow: &FlowNode) -> Result<FlowId, InferFailReason> {
Also remove it? Many places no longer use FlowTree and use only FlowNode — does that mean multiple decision paths have been abandoned?
Removed.
The multiple-path flow model is still intact. FlowAntecedent::Multiple is still represented in the CFG, expanded via get_multi_antecedents, and used at branch joins in get_type_at_flow. get_single_antecedent is only the linear-predecessor helper, so it intentionally rejects Multiple.
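The split described above can be sketched with hypothetical shapes (the real FlowNode/FlowAntecedent definitions live in the crate): a node has either one linear predecessor or several at a branch join, and the single-antecedent helper intentionally rejects joins.

```rust
// Hypothetical shapes mirroring the flow graph described above.
type FlowId = u32;

enum FlowAntecedent {
    Single(FlowId),
    Multiple(Vec<FlowId>),
}

struct FlowNode {
    antecedent: Option<FlowAntecedent>,
}

// Linear-predecessor helper: it intentionally rejects joins, which are
// expanded through a separate multi-antecedent path instead.
fn get_single_antecedent(flow: &FlowNode) -> Result<FlowId, &'static str> {
    match &flow.antecedent {
        Some(FlowAntecedent::Single(id)) => Ok(*id),
        Some(FlowAntecedent::Multiple(_)) => Err("join node: expand all antecedents"),
        None => Err("no antecedent"),
    }
}
```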
Is there any performance degradation or improvement when processing a single large file—for example, a definition file with many assignment statements of about 10k lines?
The motivation for this change was to improve performance by reducing recursion. The return_overload patch pushed the current flow analysis to its limits and I found a case where it caused indexing to hang. To be extra sure, I'll run this on a few projects and get back to you.
...and also to fix a regression that prevents processing certain (i.e., my) projects ;) So the mentioned hang is not hypothetical.

I often receive reports of performance issues from the language server on large configuration tables and on config files with many assignment definitions; I'm aware of it, but I haven't gotten around to fixing it.
No, this is not about a performance issue. This is a very specific regression introduced in #985 which makes analysis hang at 100% CPU for hours on nvim-treesitter. This PR fixes that regression so analysis completes quickly again.

I did add a stress regression test for a large linear assignment-heavy file,
I've tested indexing neovim, gitsigns and they index very quickly. I also tried my neovim config which includes all my plugins (~4000 files) and that indexed reasonably quickly too (3-6 seconds).
The performance issues come from game development projects, where many projects have over ten thousand files and individual files are megabytes in size.
Ok, then can you help me confirm whether this PR improves or decreases performance on such projects? I've yet to see a project referenced in any issue.
lua-bench.zip

Thanks for this. We should probably add this to CI in some form so we can have better control over performance regressions. The last few commits do improve analysis quite a bit, and that obviously comes at some cost; now we have a reference project that can help guide optimizations. I'll check how this PR specifically affects this workload. If it is the same or improves things then I hope we can merge this and add further optimizations in follow-ups.
lua-bench.zip does the same recursion hang on
Latest release: 1.38s. So it does slow things down quite a bit. I'll try to work on optimizing that.
Add a dedicated condition-flow cache keyed by variable reference, antecedent flow, and branch polarity so stacked guards can reuse earlier results instead of re-entering the same narrowing path recursively.

Only build correlated return-overload narrows when both the discriminant and target participate in multi-return tracking. That keeps ordinary same-variable truthiness and equality guards on the direct narrowing path while preserving correlated narrowing for real return-overload cases.

Add a regression test for repeated `if not value then return end` guards so the semantic model still builds and the narrowed type remains `string`.
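The cache described in this commit message can be sketched as below. The key fields and value type are hypothetical simplifications of the crate's real types; the point is only the (variable, flow, polarity) keying and the reuse-instead-of-recurse behavior.

```rust
use std::collections::HashMap;

// Hypothetical cache key mirroring the description above: variable
// reference, antecedent flow node, and branch polarity (then/else).
#[derive(Hash, PartialEq, Eq, Clone, Copy)]
struct NarrowKey {
    var_ref_id: u32,
    flow_id: u32,
    true_branch: bool,
}

struct ConditionFlowCache {
    results: HashMap<NarrowKey, &'static str>,
}

impl ConditionFlowCache {
    fn new() -> Self {
        Self { results: HashMap::new() }
    }

    // Stacked guards that resolve to the same (var, flow, polarity) key
    // reuse the earlier narrow instead of re-entering the narrowing path.
    fn get_or_compute(
        &mut self,
        key: NarrowKey,
        compute: impl FnOnce() -> &'static str,
    ) -> &'static str {
        *self.results.entry(key).or_insert_with(compute)
    }
}
```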
Added some caching. Ran a quick local benchmark. Method:

On this workload, HEAD is about 2.5x faster than the release build.
How is the memory usage? Has it increased significantly?
I don't know. It's most likely going to be better than
When walking backward through flow, collect condition narrows as pending
actions instead of applying them while recursively querying antecedent
types. The eager path mixed narrowing with antecedent resolution, so
stacked guards could re-enter flow inference from inside condition
evaluation and build deep recursive chains across repeated truthiness,
type-guard, signature-cast, and correlated return-overload checks.
Resolve the antecedent type first, then apply the pending narrows in
reverse order. That keeps narrowing as a post-pass over an
already-resolved input, avoids re-entering the same condition chain
while answering the current flow query, and lets same-variable
self/member guards wait until earlier guards have narrowed the receiver
enough for reliable lookup.
Key the flow cache by whether condition narrowing is enabled, and
separate assignment source lookup from condition application. Reuse a
narrowed source only when the RHS preserves that precision; broader RHS
expressions fall back to the antecedent type with condition narrowing
disabled so reassignment clears stale branch narrows, while exact
literals and compatible partial table/object rewrites still preserve
useful narrowing.
Add regression coverage for stacked guards, correlated overload joins,
pcall aliases, and assign/return diagnostics.
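The deferred pass this message describes can be modeled as a small sketch. All names here are hypothetical; the real implementation operates on the crate's flow graph and type structures. Narrows are collected newest-first while walking backward, the antecedent type is resolved on its own, and the pending narrows are then applied in reverse (oldest guard first) as a post-pass.

```rust
// Hypothetical model of the deferred-narrow pass described above.
#[derive(Clone, Copy)]
enum PendingNarrow {
    RemoveNil,
    KeepOnly(&'static str),
}

fn apply_narrow(ty: Vec<&'static str>, narrow: PendingNarrow) -> Vec<&'static str> {
    match narrow {
        PendingNarrow::RemoveNil => ty.into_iter().filter(|t| *t != "nil").collect(),
        PendingNarrow::KeepOnly(keep) => ty.into_iter().filter(|t| *t == keep).collect(),
    }
}

fn type_at_flow(
    antecedent_ty: Vec<&'static str>,
    pending: Vec<PendingNarrow>, // collected newest-first during the backward walk
) -> Vec<&'static str> {
    // Narrowing is a post-pass over an already-resolved input: no condition
    // re-enters flow inference while the antecedent type is being computed.
    pending.into_iter().rev().fold(antecedent_ty, apply_narrow)
}
```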
These added tests fail on main: